Overview

Dataset statistics

Number of variables: 16
Number of observations: 8359
Missing cells: 24088
Missing cells (%): 18.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.0 MiB
Average record size in memory: 128.0 B
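
The headline figures above can be cross-checked by hand. The sketch below recomputes the missing-cell percentage and the total memory size from the counts reported in this section and in the Warnings list; the variable names are illustrative and are not part of the report.

```python
# Counts copied from the report; names are illustrative only.
n_variables = 16
n_observations = 8359
missing_cells = 24088
avg_record_bytes = 128.0

# Per-column missing counts as listed under Warnings.
missing_by_column = {
    "Year_of_Release": 84,
    "Critic_Score": 4383,
    "Critic_Count": 4383,
    "User_Score": 3528,
    "User_Count": 4660,
    "Developer": 3489,
    "Rating": 3561,
}

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
total_mib = n_observations * avg_record_bytes / 2**20

print(f"total cells: {total_cells}")
print(f"missing cells (%): {missing_pct:.1f}")
print(f"sum of per-column missing counts: {sum(missing_by_column.values())}")
print(f"total size (MiB): {total_mib:.1f}")
```

The per-column missing counts add up to the reported 24088 missing cells, and 8359 records at 128 B each come to roughly 1.0 MiB.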

Variable types

NUM: 9
CAT: 7

Warnings

Name has a high cardinality: 6231 distinct values [High cardinality]
Publisher has a high cardinality: 295 distinct values [High cardinality]
User_Score has a high cardinality: 88 distinct values [High cardinality]
Developer has a high cardinality: 1126 distinct values [High cardinality]
Global_Sales is highly correlated with NA_Sales and 1 other field [High correlation]
NA_Sales is highly correlated with Global_Sales [High correlation]
EU_Sales is highly correlated with Global_Sales [High correlation]
Year_of_Release has 84 (1.0%) missing values [Missing]
Critic_Score has 4383 (52.4%) missing values [Missing]
Critic_Count has 4383 (52.4%) missing values [Missing]
User_Score has 3528 (42.2%) missing values [Missing]
User_Count has 4660 (55.7%) missing values [Missing]
Developer has 3489 (41.7%) missing values [Missing]
Rating has 3561 (42.6%) missing values [Missing]
Other_Sales is highly skewed (γ1 = 24.74795541) [Skewed]
Name is uniformly distributed [Uniform]
NA_Sales has 2311 (27.6%) zeros [Zeros]
EU_Sales has 3002 (35.9%) zeros [Zeros]
JP_Sales has 4807 (57.5%) zeros [Zeros]
Other_Sales has 3218 (38.5%) zeros [Zeros]

Reproduction

Analysis started: 2020-12-05 17:19:27.274453
Analysis finished: 2020-12-05 17:19:56.032107
Duration: 28.76 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

Name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 6231
Distinct (%): 74.5%
Missing: 0
Missing (%): 0.0%
Memory size: 65.3 KiB

Value | Count | Frequency (%)
Ratatouille | 9 | 0.1%
LEGO Marvel Super Heroes | 9 | 0.1%
LEGO Jurassic World | 8 | 0.1%
Lego Batman 3: Beyond Gotham | 8 | 0.1%
Cars | 8 | 0.1%
LEGO The Hobbit | 8 | 0.1%
LEGO Harry Potter: Years 5-7 | 8 | 0.1%
The LEGO Movie Videogame | 8 | 0.1%
LEGO Star Wars II: The Original Trilogy | 7 | 0.1%
Star Wars The Clone Wars: Republic Heroes | 7 | 0.1%
Other values (6221) | 8279 | 99.0%

[Figure: Frequencies of value counts]

Unique

Unique: 5015
Unique (%): 60.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 132
Median length: 22
Mean length: 23.98265343
Min length: 2

Platform
Categorical

Distinct: 31
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 65.3 KiB

Value | Count | Frequency (%)
DS | 1106 | 13.2%
PS2 | 1104 | 13.2%
Wii | 645 | 7.7%
PS3 | 643 | 7.7%
PSP | 642 | 7.7%
X360 | 588 | 7.0%
PS | 512 | 6.1%
GBA | 445 | 5.3%
PC | 439 | 5.3%
XB | 371 | 4.4%
Other values (21) | 1864 | 22.3%

[Figure: Frequencies of value counts]

Unique

Unique: 4
Unique (%): < 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 4
Median length: 3
Mean length: 2.786577342
Min length: 2

Year_of_Release
Real number (ℝ≥0)

MISSING

Distinct: 38
Distinct (%): 0.5%
Missing: 84
Missing (%): 1.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2006.393716
Minimum: 1980
Maximum: 2017
Zeros: 0
Zeros (%): 0.0%
Memory size: 65.3 KiB

Quantile statistics

Minimum: 1980
5-th percentile: 1995
Q1: 2003
Median: 2007
Q3: 2010
95-th percentile: 2015
Maximum: 2017
Range: 37
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 6.099620895
Coefficient of variation (CV): 0.003040091706
Kurtosis: 2.066641924
Mean: 2006.393716
Median Absolute Deviation (MAD): 4
Skewness: -1.100803253
Sum: 16602908
Variance: 37.20537506
Monotonicity: Not monotonic
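
Several of the descriptive statistics are derived from the others, so they can be checked against each other. The sketch below does this for Year_of_Release using only figures printed in this report; the relationships (CV = standard deviation / mean, variance = standard deviation squared, sum = mean × non-missing count) are standard definitions, and the variable names are illustrative.

```python
# Figures copied from the Year_of_Release section of the report.
mean = 2006.393716
std = 6.099620895
n_observations = 8359
n_missing = 84
minimum, maximum = 1980, 2017
q1, q3 = 2003, 2010

cv = std / mean                      # coefficient of variation
variance = std ** 2                  # variance is the squared standard deviation
value_range = maximum - minimum      # range
iqr = q3 - q1                        # interquartile range
total = mean * (n_observations - n_missing)  # sum over non-missing rows

print(f"CV: {cv:.12f}")
print(f"Variance: {variance:.8f}")
print(f"Range: {value_range}, IQR: {iqr}")
print(f"Sum: {total:.0f}")
```

The recomputed values agree with the reported CV, variance, range, IQR and sum to within rounding of the printed inputs.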
[Figure: Histogram with fixed size bins (bins=38)]

Most frequent values:
Value | Count | Frequency (%)
2008 | 735 | 8.8%
2009 | 696 | 8.3%
2010 | 645 | 7.7%
2007 | 625 | 7.5%
2011 | 555 | 6.6%
2006 | 510 | 6.1%
2005 | 471 | 5.6%
2003 | 419 | 5.0%
2004 | 404 | 4.8%
2002 | 375 | 4.5%
Other values (28) | 2840 | 34.0%

Smallest values:
Value | Count | Frequency (%)
1980 | 4 | < 0.1%
1981 | 34 | 0.4%
1982 | 23 | 0.3%
1983 | 14 | 0.2%
1984 | 9 | 0.1%

Largest values:
Value | Count | Frequency (%)
2017 | 3 | < 0.1%
2016 | 260 | 3.1%
2015 | 304 | 3.6%
2014 | 284 | 3.4%
2013 | 279 | 3.3%

Genre
Categorical

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 65.3 KiB

Value | Count | Frequency (%)
Action | 1743 | 20.9%
Role-Playing | 912 | 10.9%
Misc | 905 | 10.8%
Sports | 879 | 10.5%
Adventure | 741 | 8.9%
Shooter | 584 | 7.0%
Platform | 565 | 6.8%
Racing | 525 | 6.3%
Fighting | 440 | 5.3%
Strategy | 390 | 4.7%
Other values (2) | 675 | 8.1%

[Figure: Frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 12
Median length: 6
Mean length: 7.273238426
Min length: 4

Publisher
Categorical

HIGH CARDINALITY

Distinct: 295
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 65.3 KiB

Value | Count | Frequency (%)
THQ | 715 | 8.6%
Nintendo | 706 | 8.4%
Sony Computer Entertainment | 687 | 8.2%
Sega | 638 | 7.6%
Take-Two Interactive | 422 | 5.0%
Capcom | 386 | 4.6%
Atari | 367 | 4.4%
Tecmo Koei | 348 | 4.2%
Warner Bros. Interactive Entertainment | 235 | 2.8%
Square Enix | 234 | 2.8%
Other values (285) | 3621 | 43.3%

[Figure: Frequencies of value counts]

Unique

Unique: 93
Unique (%): 1.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 38
Median length: 10
Mean length: 12.83263548
Min length: 3

NA_Sales
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 345
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.71994258
Minimum: 0
Maximum: 4136
Zeros: 2311
Zeros (%): 27.6%
Memory size: 65.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 8
Q3: 25
95-th percentile: 125
Maximum: 4136
Range: 4136
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 104.3499353
Coefficient of variation (CV): 3.396814139
Kurtosis: 472.0553993
Mean: 30.71994258
Median Absolute Deviation (MAD): 8
Skewness: 17.0044509
Sum: 256788
Variance: 10888.909
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values:
Value | Count | Frequency (%)
0 | 2311 | 27.6%
2 | 280 | 3.3%
4 | 262 | 3.1%
3 | 249 | 3.0%
7 | 249 | 3.0%
6 | 247 | 3.0%
5 | 243 | 2.9%
8 | 235 | 2.8%
1 | 233 | 2.8%
9 | 208 | 2.5%
Other values (335) | 3842 | 46.0%

Smallest values:
Value | Count | Frequency (%)
0 | 2311 | 27.6%
1 | 233 | 2.8%
2 | 280 | 3.3%
3 | 249 | 3.0%
4 | 262 | 3.1%

Largest values:
Value | Count | Frequency (%)
4136 | 1 | < 0.1%
2908 | 1 | < 0.1%
2693 | 1 | < 0.1%
2320 | 1 | < 0.1%
1568 | 1 | < 0.1%

EU_Sales
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 254
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.06771145
Minimum: 0
Maximum: 2896
Zeros: 3002
Zeros (%): 35.9%
Memory size: 65.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 2
Q3: 12
95-th percentile: 67
Maximum: 2896
Range: 2896
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 60.93694708
Coefficient of variation (CV): 3.792509423
Kurtosis: 689.6999538
Mean: 16.06771145
Median Absolute Deviation (MAD): 2
Skewness: 19.50701852
Sum: 134310
Variance: 3713.311519
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values:
Value | Count | Frequency (%)
0 | 3002 | 35.9%
1 | 697 | 8.3%
2 | 617 | 7.4%
3 | 425 | 5.1%
4 | 329 | 3.9%
5 | 275 | 3.3%
6 | 210 | 2.5%
7 | 175 | 2.1%
8 | 157 | 1.9%
9 | 130 | 1.6%
Other values (244) | 2342 | 28.0%

Smallest values:
Value | Count | Frequency (%)
0 | 3002 | 35.9%
1 | 697 | 8.3%
2 | 617 | 7.4%
3 | 425 | 5.1%
4 | 329 | 3.9%

Largest values:
Value | Count | Frequency (%)
2896 | 1 | < 0.1%
1276 | 1 | < 0.1%
1095 | 1 | < 0.1%
1093 | 1 | < 0.1%
919 | 1 | < 0.1%

JP_Sales
Real number (ℝ≥0)

ZEROS

Distinct: 230
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.30888862
Minimum: 0
Maximum: 1022
Zeros: 4807
Zeros (%): 57.5%
Memory size: 65.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 6
95-th percentile: 53
Maximum: 1022
Range: 1022
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 41.21591524
Coefficient of variation (CV): 3.644559303
Kurtosis: 117.4813734
Mean: 11.30888862
Median Absolute Deviation (MAD): 0
Skewness: 8.949449848
Sum: 94531
Variance: 1698.751669
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values:
Value | Count | Frequency (%)
0 | 4807 | 57.5%
2 | 398 | 4.8%
1 | 370 | 4.4%
3 | 281 | 3.4%
4 | 216 | 2.6%
5 | 171 | 2.0%
6 | 161 | 1.9%
8 | 128 | 1.5%
7 | 125 | 1.5%
9 | 88 | 1.1%
Other values (220) | 1614 | 19.3%

Smallest values:
Value | Count | Frequency (%)
0 | 4807 | 57.5%
1 | 370 | 4.4%
2 | 398 | 4.8%
3 | 281 | 3.4%
4 | 216 | 2.6%

Largest values:
Value | Count | Frequency (%)
1022 | 1 | < 0.1%
720 | 1 | < 0.1%
681 | 1 | < 0.1%
650 | 1 | < 0.1%
604 | 1 | < 0.1%

Other_Sales
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 122
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.241057543
Minimum: 0
Maximum: 1057
Zeros: 3218
Zeros (%): 38.5%
Memory size: 65.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 4
95-th percentile: 21
Maximum: 1057
Range: 1057
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 22.94153141
Coefficient of variation (CV): 4.377271423
Kurtosis: 909.5805874
Mean: 5.241057543
Median Absolute Deviation (MAD): 1
Skewness: 24.74795541
Sum: 43810
Variance: 526.3138633
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values:
Value | Count | Frequency (%)
0 | 3218 | 38.5%
1 | 1691 | 20.2%
2 | 812 | 9.7%
3 | 473 | 5.7%
4 | 340 | 4.1%
5 | 251 | 3.0%
6 | 201 | 2.4%
7 | 183 | 2.2%
8 | 127 | 1.5%
9 | 90 | 1.1%
Other values (112) | 973 | 11.6%

Smallest values:
Value | Count | Frequency (%)
0 | 3218 | 38.5%
1 | 1691 | 20.2%
2 | 812 | 9.7%
3 | 473 | 5.7%
4 | 340 | 4.1%

Largest values:
Value | Count | Frequency (%)
1057 | 1 | < 0.1%
844 | 1 | < 0.1%
753 | 1 | < 0.1%
396 | 1 | < 0.1%
329 | 1 | < 0.1%

Global_Sales
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 517
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 63.37181481
Minimum: 1
Maximum: 8253
Zeros: 0
Zeros (%): 0.0%
Memory size: 65.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 6
Median: 18
Q3: 51
95-th percentile: 238.1
Maximum: 8253
Range: 8252
Interquartile range (IQR): 45

Descriptive statistics

Standard deviation: 199.3948557
Coefficient of variation (CV): 3.146428051
Kurtosis: 432.2219434
Mean: 63.37181481
Median Absolute Deviation (MAD): 15
Skewness: 15.53653482
Sum: 529725
Variance: 39758.30849
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values:
Value | Count | Frequency (%)
2 | 548 | 6.6%
3 | 407 | 4.9%
1 | 316 | 3.8%
4 | 314 | 3.8%
5 | 305 | 3.6%
7 | 264 | 3.2%
6 | 262 | 3.1%
9 | 235 | 2.8%
8 | 227 | 2.7%
11 | 199 | 2.4%
Other values (507) | 5282 | 63.2%

Smallest values:
Value | Count | Frequency (%)
1 | 316 | 3.8%
2 | 548 | 6.6%
3 | 407 | 4.9%
4 | 314 | 3.8%
5 | 305 | 3.6%

Largest values:
Value | Count | Frequency (%)
8253 | 1 | < 0.1%
4024 | 1 | < 0.1%
3552 | 1 | < 0.1%
3277 | 1 | < 0.1%
3137 | 1 | < 0.1%

Critic_Score
Real number (ℝ≥0)

MISSING

Distinct: 78
Distinct (%): 2.0%
Missing: 4383
Missing (%): 52.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.18762575
Minimum: 19
Maximum: 98
Zeros: 0
Zeros (%): 0.0%
Memory size: 65.3 KiB

Quantile statistics

Minimum: 19
5-th percentile: 43.75
Q1: 61
Median: 71
Q3: 79
95-th percentile: 89
Maximum: 98
Range: 79
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 13.75648075
Coefficient of variation (CV): 0.1988286286
Kurtosis: 0.09917993453
Mean: 69.18762575
Median Absolute Deviation (MAD): 9
Skewness: -0.5712648591
Sum: 275090
Variance: 189.2407626
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values:
Value | Count | Frequency (%)
71 | 137 | 1.6%
70 | 128 | 1.5%
80 | 120 | 1.4%
69 | 120 | 1.4%
74 | 118 | 1.4%
73 | 118 | 1.4%
72 | 118 | 1.4%
75 | 116 | 1.4%
68 | 116 | 1.4%
66 | 114 | 1.4%
Other values (68) | 2771 | 33.1%
(Missing) | 4383 | 52.4%

Smallest values:
Value | Count | Frequency (%)
19 | 1 | < 0.1%
20 | 2 | < 0.1%
23 | 1 | < 0.1%
24 | 2 | < 0.1%
25 | 2 | < 0.1%

Largest values:
Value | Count | Frequency (%)
98 | 2 | < 0.1%
97 | 10 | 0.1%
96 | 14 | 0.2%
95 | 11 | 0.1%
94 | 18 | 0.2%

Critic_Count
Real number (ℝ≥0)

MISSING

Distinct: 105
Distinct (%): 2.6%
Missing: 4383
Missing (%): 52.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.53998994
Minimum: 4
Maximum: 113
Zeros: 0
Zeros (%): 0.0%
Memory size: 65.3 KiB

Quantile statistics

Minimum: 4
5-th percentile: 5
Q1: 12
Median: 24
Q3: 40
95-th percentile: 71
Maximum: 113
Range: 109
Interquartile range (IQR): 28

Descriptive statistics

Standard deviation: 20.42759043
Coefficient of variation (CV): 0.7157532456
Kurtosis: 0.7323293219
Mean: 28.53998994
Median Absolute Deviation (MAD): 13
Skewness: 1.068743552
Sum: 113475
Variance: 417.2864507
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values:
Value | Count | Frequency (%)
4 | 143 | 1.7%
5 | 135 | 1.6%
12 | 113 | 1.4%
17 | 113 | 1.4%
7 | 108 | 1.3%
6 | 105 | 1.3%
9 | 104 | 1.2%
10 | 102 | 1.2%
8 | 101 | 1.2%
18 | 99 | 1.2%
Other values (95) | 2853 | 34.1%
(Missing) | 4383 | 52.4%

Smallest values:
Value | Count | Frequency (%)
4 | 143 | 1.7%
5 | 135 | 1.6%
6 | 105 | 1.3%
7 | 108 | 1.3%
8 | 101 | 1.2%

Largest values:
Value | Count | Frequency (%)
113 | 1 | < 0.1%
107 | 1 | < 0.1%
106 | 1 | < 0.1%
105 | 1 | < 0.1%
104 | 1 | < 0.1%

User_Score
Categorical

HIGH CARDINALITY
MISSING

Distinct: 88
Distinct (%): 1.8%
Missing: 3528
Missing (%): 42.2%
Memory size: 65.3 KiB

Value | Count | Frequency (%)
tbd | 1132 | 13.5%
8 | 165 | 2.0%
8.2 | 160 | 1.9%
7.8 | 155 | 1.9%
8.3 | 137 | 1.6%
8.1 | 136 | 1.6%
8.5 | 130 | 1.6%
7.9 | 126 | 1.5%
7.5 | 122 | 1.5%
7.4 | 115 | 1.4%
Other values (78) | 2453 | 29.3%
(Missing) | 3528 | 42.2%

[Figure: Frequencies of value counts]

Unique

Unique: 10
Unique (%): 0.2%

[Figure: Histogram of lengths of the category]

Length

Max length: 3
Median length: 3
Mean length: 2.889699725
Min length: 1
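
User_Score is profiled as Categorical, and its most frequent value is the string tbd (1132 rows), which is the likely reason the column was not treated as numeric. The sketch below shows one possible way to coerce such values before re-profiling. The function name parse_user_score and the choice to map tbd to a missing value are assumptions made for illustration, not something the report prescribes.

```python
def parse_user_score(raw):
    """Return the score as a float, or None for missing, 'tbd' or unparseable input."""
    if raw is None:
        return None
    text = str(raw).strip()
    # Treat placeholders as missing (an assumption; 'tbd' could also be kept as its own flag).
    if text == "" or text.lower() in {"tbd", "nan"}:
        return None
    try:
        return float(text)
    except ValueError:
        return None


# Hypothetical sample mixing the kinds of values seen in the frequency table.
sample = ["tbd", "8", "8.2", None, "7.8"]
scores = [parse_user_score(value) for value in sample]
print(scores)
```

Mapping tbd to a missing value would raise the column's missing count by the number of tbd rows, so the choice is worth recording alongside any downstream analysis.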

User_Count
Real number (ℝ≥0)

MISSING

Distinct: 641
Distinct (%): 17.3%
Missing: 4660
Missing (%): 55.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 180.2625034
Minimum: 4
Maximum: 9851
Zeros: 0
Zeros (%): 0.0%
Memory size: 65.3 KiB

Quantile statistics

Minimum: 4
5-th percentile: 5
Q1: 11
Median: 28
Q3: 100
95-th percentile: 866.2
Maximum: 9851
Range: 9847
Interquartile range (IQR): 89

Descriptive statistics

Standard deviation: 576.9884653
Coefficient of variation (CV): 3.200823546
Kurtosis: 94.0185863
Mean: 180.2625034
Median Absolute Deviation (MAD): 21
Skewness: 8.255115118
Sum: 666791
Variance: 332915.6891
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Most frequent values:
Value | Count | Frequency (%)
4 | 159 | 1.9%
6 | 156 | 1.9%
5 | 156 | 1.9%
8 | 134 | 1.6%
7 | 111 | 1.3%
9 | 101 | 1.2%
10 | 93 | 1.1%
11 | 92 | 1.1%
12 | 73 | 0.9%
13 | 73 | 0.9%
Other values (631) | 2551 | 30.5%
(Missing) | 4660 | 55.7%

Smallest values:
Value | Count | Frequency (%)
4 | 159 | 1.9%
5 | 156 | 1.9%
6 | 156 | 1.9%
7 | 111 | 1.3%
8 | 134 | 1.6%

Largest values:
Value | Count | Frequency (%)
9851 | 1 | < 0.1%
9073 | 1 | < 0.1%
8665 | 1 | < 0.1%
8003 | 1 | < 0.1%
7512 | 1 | < 0.1%

Developer
Categorical

HIGH CARDINALITY
MISSING

Distinct: 1126
Distinct (%): 23.1%
Missing: 3489
Missing (%): 41.7%
Memory size: 65.3 KiB

Value | Count | Frequency (%)
Capcom | 123 | 1.5%
Visual Concepts | 98 | 1.2%
TT Games | 72 | 0.9%
Nintendo | 72 | 0.9%
THQ | 69 | 0.8%
Omega Force | 66 | 0.8%
Traveller's Tales | 60 | 0.7%
Yuke's | 54 | 0.6%
High Voltage Software | 46 | 0.6%
Square Enix | 46 | 0.6%
Other values (1116) | 4164 | 49.8%
(Missing) | 3489 | 41.7%

[Figure: Frequencies of value counts]

Unique

Unique: 478
Unique (%): 9.8%

[Figure: Histogram of lengths of the category]

Length

Max length: 80
Median length: 6
Mean length: 9.198349085
Min length: 2

Rating
Categorical

MISSING

Distinct: 8
Distinct (%): 0.2%
Missing: 3561
Missing (%): 42.6%
Memory size: 65.3 KiB

Value | Count | Frequency (%)
E | 1880 | 22.5%
T | 1404 | 16.8%
M | 772 | 9.2%
E10+ | 731 | 8.7%
EC | 8 | 0.1%
AO | 1 | < 0.1%
K-A | 1 | < 0.1%
RP | 1 | < 0.1%
(Missing) | 3561 | 42.6%

[Figure: Frequencies of value counts]

Unique

Unique: 3
Unique (%): 0.1%

[Figure: Histogram of lengths of the category]

Length

Max length: 4
Median length: 3
Mean length: 2.115803326
Min length: 1

Interactions

[81 interaction plots (Matplotlib SVG images); not reproduced in this text]

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
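
The calculation described above can be sketched in a few lines of plain Python. This is an illustration of the textbook formula, not the code the report generator runs; the function name pearson_r and the sample data are made up for the example.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so raw sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(xs, ys)
# Shifting and rescaling one variable (by a positive factor) leaves r unchanged.
r_rescaled = pearson_r([10 * x + 5 for x in xs], ys)
print(r, r_rescaled)
```

The second call illustrates the location and scale invariance mentioned in the description: both results agree up to floating-point error.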

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
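
As a sketch of that recipe: rank both variables (here with average ranks for ties, which is a common convention and an assumption of this example), then apply the Pearson formula to the ranks. The helper names average_ranks and spearman_rho are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A nonlinear but strictly increasing relationship (y = x cubed).
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

For this monotonic but nonlinear sample, ρ is at its maximum while Pearson's r falls short of 1, which is the behaviour the description refers to.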

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
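
The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition in plain Python; library implementations often default to a tie-adjusted variant (tau-b), so results can differ when the data contain ties. The function name kendall_tau_a is illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Three observations: two pairs agree in order, one pair disagrees.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```

The loop is quadratic in the number of observations, which is fine for illustration but slow for large columns.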

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
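
A minimal sketch of a bias-corrected Cramér's V follows. It uses the commonly cited form of Bergsma's correction (shrink φ² by (k-1)(r-1)/(n-1) and adjust the row and column counts); the report does not spell out its formula, so treat the details, and the function name cramers_v_corrected, as assumptions of this example.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table, e.g. a variable with a single category
    return math.sqrt(phi2_corr / denom)


# Perfect association: each label of xs always pairs with the same label of ys.
v_perfect = cramers_v_corrected(["a", "b"] * 50, ["x", "y"] * 50)
# Independence: every combination occurs equally often.
v_independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25)
print(v_perfect, v_independent)
```

The two toy inputs hit the ends of the 0 to 1 range described above: perfectly associated labels give a value of about 1, and a balanced independent table gives 0.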

Missing values

[4 missing-value plots (Matplotlib SVG images); not reproduced in this text]

Sample

First rows

(index) | Name | Platform | Year_of_Release | Genre | Publisher | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales | Critic_Score | Critic_Count | User_Score | User_Count | Developer | Rating
0 | LEGO Batman: The Videogame | Wii | NaN | Action | Warner Bros. Interactive Entertainment | 180 | 97 | 0 | 28 | 306 | 74.0 | 17.0 | 7.9 | 22.0 | Traveller's Tales | E10+
1 | LEGO Indiana Jones: The Original Adventures | Wii | NaN | Action | LucasArts | 151 | 61 | 0 | 21 | 234 | 78.0 | 22.0 | 6.6 | 28.0 | Traveller's Tales | E10+
2 | LEGO Batman: The Videogame | PSP | NaN | Action | Warner Bros. Interactive Entertainment | 56 | 44 | 0 | 27 | 128 | 73.0 | 5.0 | 7.4 | 10.0 | Traveller's Tales | E10+
3 | Combat | 2600 | NaN | Action | Atari | 117 | 7 | 0 | 1 | 125 | NaN | NaN | NaN | NaN | NaN | NaN
4 | LEGO Harry Potter: Years 5-7 | Wii | NaN | Action | Warner Bros. Interactive Entertainment | 69 | 42 | 0 | 12 | 124 | 76.0 | 8.0 | 7.8 | 13.0 | Traveller's Tales | E10+
5 | LEGO Harry Potter: Years 5-7 | X360 | NaN | Action | Warner Bros. Interactive Entertainment | 51 | 37 | 0 | 9 | 97 | 77.0 | 35.0 | 7.9 | 39.0 | Traveller's Tales | E10+
6 | Yakuza 4 | PS3 | NaN | Action | Sega | 15 | 13 | 63 | 5 | 95 | 78.0 | 59.0 | 8 | 177.0 | Ryu ga Gotoku Studios | M
7 | LEGO Harry Potter: Years 5-7 | PS3 | NaN | Action | Warner Bros. Interactive Entertainment | 36 | 41 | 0 | 15 | 91 | 76.0 | 27.0 | 8.3 | 48.0 | Traveller's Tales | E10+
8 | The Lord of the Rings: War in the North | X360 | NaN | Action | Warner Bros. Interactive Entertainment | 52 | 24 | 0 | 8 | 84 | 61.0 | 48.0 | 7.4 | 113.0 | Snowblind Studios | M
9 | The Lord of the Rings: War in the North | PS3 | NaN | Action | Warner Bros. Interactive Entertainment | 25 | 42 | 1 | 13 | 82 | 63.0 | 33.0 | 7 | 100.0 | Snowblind Studios | M

Last rows

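(index) | Name | Platform | Year_of_Release | Genre | Publisher | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales | Critic_Score | Critic_Count | User_Score | User_Count | Developer | Rating
8349 | XCOM 2 | PS4 | 2016.0 | Strategy | Take-Two Interactive | 4 | 8 | 0 | 2 | 14 | 88.0 | 28.0 | 8 | 116.0 | Firaxis Games | T
8350 | Total War: WARHAMMER | PC | 2016.0 | Strategy | Sega | 0 | 12 | 0 | 1 | 13 | 86.0 | 77.0 | 7.3 | 556.0 | Creative Assembly | T
8351 | Culdcept Revolt | 3DS | 2016.0 | Strategy | Nintendo | 0 | 0 | 6 | 0 | 6 | NaN | NaN | NaN | NaN | NaN | NaN
8352 | Hearts of Iron IV | PC | 2016.0 | Strategy | Paradox Interactive | 0 | 5 | 0 | 0 | 5 | 83.0 | 36.0 | 6.9 | 306.0 | Paradox Development Studio | NaN
8353 | XCOM 2 | XOne | 2016.0 | Strategy | Take-Two Interactive | 2 | 2 | 0 | 0 | 5 | 87.0 | 17.0 | 8.1 | 40.0 | Firaxis Games | T
8354 | Stellaris | PC | 2016.0 | Strategy | Paradox Interactive | 0 | 4 | 0 | 0 | 4 | 78.0 | 57.0 | 8 | 569.0 | Paradox Development Studio | NaN
8355 | Total War Attila: Tyrants & Kings | PC | 2016.0 | Strategy | Koch Media | 0 | 1 | 0 | 0 | 1 | NaN | NaN | NaN | NaN | NaN | NaN
8356 | Brothers Conflict: Precious Baby | PSV | 2017.0 | Action | Idea Factory | 0 | 0 | 1 | 0 | 1 | NaN | NaN | NaN | NaN | NaN | NaN
8357 | Phantasy Star Online 2 Episode 4: Deluxe Package | PS4 | 2017.0 | Role-Playing | Sega | 0 | 0 | 4 | 0 | 4 | NaN | NaN | NaN | NaN | NaN | NaN
8358 | Phantasy Star Online 2 Episode 4: Deluxe Package | PSV | 2017.0 | Role-Playing | Sega | 0 | 0 | 1 | 0 | 1 | NaN | NaN | NaN | NaN | NaN | NaN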